Reconstructing Websites for the Lazy Webmaster
Abstract
Backup or preservation of websites is often not considered until after a catastrophic event has occurred. In the face of complete website loss, “lazy” webmasters or concerned third parties may be able to recover some of their website from the Internet Archive. Other pages may also be salvaged from commercial search engine caches. We introduce the concept of “lazy preservation”: digital preservation performed as a result of the normal operations of the Web infrastructure (search engines and caches). We present Warrick, a tool to automate the process of website reconstruction from the Internet Archive, Google, MSN and Yahoo. Using Warrick, we have reconstructed 24 websites of varying sizes and composition to demonstrate the feasibility and limitations of website reconstruction from the public Web infrastructure. To measure Warrick’s window of opportunity, we have profiled the time required for new Web resources to enter and leave search engine caches.
Similar resources
Website Reconstruction using the Web Infrastructure
Backup or preservation of websites is often not considered until after a catastrophic event has occurred. In the face of complete website loss, webmasters or concerned third parties may be able to recover some of their website from the Internet Archive. Other pages may also be salvaged from commercial search engine (SE) caches if caught in time. We introduce the concept of “lazy preservation”di...
Website Forensic Investigation to Identify Evidence and Impact of Compromise
Compromised websites that redirect users to malicious websites are often used by attackers to distribute malware. These attackers compromise popular websites and integrate them into a drive-by download attack scheme to lure unsuspecting users to malicious websites. An incident response organization such as a CSIRT contributes to preventing the spread of malware infection by analyzing compromise...
An Introduction to Implicit Invocation Architectures
ColdFusion's initial appeal was to "webmasters" who wanted to make their sites more dynamic. It succeeded admirably. But just as the term, webmaster, is an anachronism, the call for more dynamic websites has been succeeded by the need for true web applications. As these applications become more involved and more ambitious in scope, ColdFusion developers find that a thorough knowledge of tags an...
Ana and the Internet: a review of pro-anorexia websites.
OBJECTIVE: The purpose of this article is to describe the content of pro-anorexia websites, both qualitatively and quantitatively. METHOD: An Internet search protocol was developed to identify pro-anorexia websites. A grounded theory approach was used to generate themes from Internet-based information. Basic descriptive analysis was employed to report on key website characteristics. RESULTS: T...
Websites of Indian Institutes of Technology: a Webometric Study
The study explored different characteristics of linking analysis of sixteen IIT websites. All the IITs have their own websites, and all of them operate under the homogeneous Domain Name System (DNS) domain “.ac.in”. The rankings of the Indian Institutes of Technology (IITs) were compared using WISER, WIF (in-link) and World Rank. The WISER ranking and WIF (in-link) have a correlation, i.e. +0....
Journal: CoRR
Volume: abs/cs/0512069
Publication date: 2005